Mixture modeling of microarray gene expression data
Abstract
About 28% of genes appear to have an expression pattern that follows a mixture distribution. We use first- and second-order partial correlation coefficients to identify trios and quartets of non-sex-linked genes that are highly associated and that are also mixtures. We identified 18 trio and 35 quartet mixtures and evaluated their mixture distribution concordance. Concordance was defined as the proportion of observations that simultaneously fall in the component with the higher mean or simultaneously in the component with the lower mean based on their Bayesian posterior probabilities. These trios and quartets have a concordance rate greater than 80%. There are 33 genes involved in these trios and quartets. A factor analysis with varimax rotation identifies three gene groups based on their factor loadings. One group of 18 genes has a concordance rate of 56.7%, another group of 8 genes has a concordance rate of 60.8%, and a third group of 7 genes has a concordance rate of 69.6%. Each of these rates is highly significant, suggesting that there may be strong biological underpinnings for the mixture mechanisms of these genes. Bayesian factor screening confirms this hypothesis by identifying six single-nucleotide polymorphisms that are significantly associated with the expression phenotypes of the five most concordant genes in the first group.
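As an illustration of the concordance definition above, the following Python sketch computes the rate for a gene trio or quartet. It assumes that a two-component mixture has already been fitted to each gene and that an observation is assigned to the higher-mean component when its posterior probability exceeds 0.5. The function name `concordance`, the 0.5 cutoff, and the example posterior values are illustrative assumptions and are not taken from the paper.

```python
from typing import Sequence


def concordance(posteriors: Sequence[Sequence[float]], threshold: float = 0.5) -> float:
    """Proportion of observations on which all genes agree.

    posteriors[g][i] is the posterior probability that observation i of
    gene g belongs to the component with the higher mean (hypothetical
    input layout). An observation counts as concordant when every gene
    places it in the higher-mean component, or every gene places it in
    the lower-mean component.
    """
    if not posteriors:
        raise ValueError("at least one gene is required")
    n_obs = len(posteriors[0])
    if n_obs == 0:
        raise ValueError("at least one observation is required")
    if any(len(row) != n_obs for row in posteriors):
        raise ValueError("all genes must have the same number of observations")

    agree = 0
    for i in range(n_obs):
        # Hard assignment per gene: True = higher-mean component (assumed cutoff).
        labels = {row[i] > threshold for row in posteriors}
        if len(labels) == 1:  # all genes gave the same label
            agree += 1
    return agree / n_obs


# Made-up posteriors for a trio of genes over five observations.
example_trio = [
    [0.90, 0.80, 0.10, 0.20, 0.70],
    [0.95, 0.60, 0.20, 0.10, 0.30],
    [0.70, 0.90, 0.05, 0.40, 0.80],
]
print(f"concordance = {concordance(example_trio):.2f}")  # prints "concordance = 0.80"
```

In this made-up example the three genes agree on four of the five observations, so the rate is 0.80. The paper's own assignment rule may differ from the simple 0.5 cutoff used here.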
Related articles
Feature Selection and Classification of Microarray Gene Expression Data of Ovarian Carcinoma Patients using Weighted Voting Support Vector Machine
DNA microarray gene expression data provide a wealth of information across thousands of variables (genes). Analysis of this information can reveal the genetic causes of disease and of differences between tumors. In this study we try to reduce the high-dimensional data with a statistical method to select valuable, high-impact genes as biomarkers and then classify ovarian tumors based on gene expression data of...
Modification of the Fast Global K-means Using a Fuzzy Relation with Application in Microarray Data Analysis
Recognizing genes with distinctive expression levels can help in the prevention, diagnosis and treatment of diseases at the genomic level. In this paper, fast global k-means (fast GKM) is developed for clustering gene expression datasets. Fast GKM is a significant improvement on the k-means clustering method. It is an incremental clustering method that starts with one cluster. Iteratively ...
DNA Microarrays and Gene Expression - From Experiments to Data Analysis and Modeling
Gene Identification from Microarray Data for Diagnosis of Acute Myeloid and Lymphoblastic Leukemia Using a Sparse Gene Selection Method
Background: Microarray experiments can simultaneously determine the expression of thousands of genes. Identification of potential genes from microarray data for diagnosis of cancer is important. This study aimed to identify genes for the diagnosis of acute myeloid and lymphoblastic leukemia using a sparse feature selection method. Materials and Methods: In this descriptive study, the expressio...
Integration and Reduction of Microarray Gene Expressions Using an Information Theory Approach
The DNA microarray is an important technique that allows researchers to analyze the expression of many genes in parallel. Although the data can be more significant if they come from separate experiments, one of the most challenging phases in the microarray context is the integration of separate expression-level datasets that have been gathered through different techniques. In this paper, we prese...
Assessment of reliability of microarray data and estimation of signal thresholds using mixture modeling.
DNA microarray is an important tool for the study of gene activities but the resultant data consisting of thousands of points are error-prone. A serious limitation in microarray analysis is the unreliability of the data generated from low signal intensities. Such data may produce erroneous gene expression ratios and cause unnecessary validation or post-analysis follow-up tasks. In this study, w...
Journal title:
- BMC Proceedings
Volume 1
Publication date: 2007